Incorporating Knowledge into Document Summarisation: an Application of Prefix-Tuning on GPT-2
Despite great advances in document summarisation techniques, factual
inconsistencies between the generated summaries and the original texts still
occur from time to time. This study explores the possibility of adopting
prompts to incorporate factual knowledge into generated summaries. We
specifically study prefix-tuning, which uses a set of trainable continuous
prefix prompts together with discrete natural language prompts to aid summary
generation. Experimental results demonstrate that the trainable prefixes help
the summarisation model extract information from the discrete prompts
precisely, thus generating knowledge-preserving summaries that are factually
consistent with the discrete prompts. The ROUGE improvements of the generated
summaries indicate that explicitly adding factual knowledge into the
summarisation process can boost overall performance, showing great potential
for applying this approach to other natural language processing tasks.
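The core prefix-tuning idea described in this abstract, trainable continuous prefix vectors prepended to the frozen model's embedded input, can be sketched as follows. This is a minimal NumPy illustration; the dimensions, names, and initialisation scale are assumptions for demonstration, not the paper's actual GPT-2 configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def prepend_prefix(token_embeds, prefix):
    """Prepend trainable continuous prefix vectors to the embedded tokens
    of a discrete prompt. token_embeds: (seq_len, d); prefix: (prefix_len, d)."""
    return np.concatenate([prefix, token_embeds], axis=0)

# In prefix-tuning, only these prefix vectors are updated during training;
# the backbone language model and its embeddings stay frozen.
prefix = rng.normal(scale=0.02, size=(10, 16))  # hypothetical prefix_len=10, d=16
tokens = rng.normal(size=(5, 16))               # stand-in for an embedded discrete prompt
combined = prepend_prefix(tokens, prefix)
print(combined.shape)  # (15, 16): prefix slots followed by prompt tokens
```

The frozen model then attends over the prefix slots exactly as it would over real tokens, which is how the continuous prompts can steer generation without touching the model's weights.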
Identifying Domains and Concepts in Short Texts via Partial Taxonomy and Unlabeled Data
Accurate and real-time identification of the domains and concepts discussed in microblogging texts is crucial for many important applications such as earthquake monitoring, influenza surveillance and disaster management. Existing techniques such as machine learning and keyword generation are application specific and require a significant amount of training to achieve high accuracy. In this paper, we propose to use a multiple domain taxonomy (MDT) to capture general user knowledge. We formally define the problems of domain classification and concept tagging. Using the MDT, we devise domain-independent pure frequency-count methods that require no training data or annotations and are not sensitive to misspellings or shortened word forms. Our extensive experimental analysis on real Twitter data shows that both methods achieve significantly better identification accuracy and lower runtime than existing methods on large datasets.
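A taxonomy-based pure frequency count of the kind this abstract describes can be sketched as below. The toy taxonomy and its term lists are invented for illustration and are not the paper's MDT; the point is that classification needs no training data, only counts of taxonomy-term matches.

```python
# Hypothetical two-domain taxonomy; each domain maps to a set of terms.
taxonomy = {
    "earthquake": {"quake", "magnitude", "tremor", "epicenter"},
    "influenza": {"flu", "fever", "vaccine", "cough"},
}

def classify_domain(text, taxonomy):
    """Assign the domain whose taxonomy terms occur most often in the text.
    Pure frequency counting: no training data, no annotations."""
    tokens = text.lower().split()
    counts = {d: sum(t in terms for t in tokens) for d, terms in taxonomy.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None  # None when no domain term matches

print(classify_domain("Felt a strong tremor near the epicenter today", taxonomy))
# earthquake
```

Robustness to misspellings or shortened word forms would, in practice, require fuzzier term matching than the exact-token lookup used in this sketch.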
Learning to Select the Relevant History Turns in Conversational Question Answering
The increasing demand for web-based digital assistants has rapidly increased
the interest of the Information Retrieval (IR) community in the field of
conversational question answering (ConvQA). A critical aspect of ConvQA is the
effective selection of conversational history turns to answer the question at
hand. The dependency between relevant history selection and correct answer
prediction is an intriguing but under-explored area. Relevant context can
better guide the system towards where exactly in the passage to look for an
answer, whereas irrelevant context introduces noise and degrades the model's
performance. In this paper, we propose a framework, DHS-ConvQA (Dynamic History
Selection in Conversational Question Answering), that first generates the
context and question entities for all the history turns, which are then pruned
based on the similarity they share with the question at hand. We also propose
an attention-based mechanism to re-rank the pruned terms based on weights
reflecting how useful they are in answering the question. Finally, we further
aid the model by highlighting the terms in the re-ranked conversational
history via a binary classification task, keeping the useful terms (predicted
as 1) and ignoring the irrelevant ones (predicted as 0). We demonstrate the
efficacy of our proposed framework with extensive experimental results on
CANARD and QuAC, two widely used ConvQA datasets. We show that selecting
relevant turns works better than rewriting the original question, investigate
how adding irrelevant history turns negatively impacts the model's
performance, and discuss research challenges that demand more attention from
the IR community.
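The first stage described above, pruning history turns by the entities they share with the current question, might be sketched as follows. The set-overlap similarity and the threshold are assumptions for illustration, not the paper's exact pruning criterion.

```python
def prune_history(turn_entities, question_entities, threshold=1):
    """Keep indices of history turns that share at least `threshold`
    entities with the current question; the rest are treated as noise.
    A stand-in for DHS-ConvQA's similarity-based pruning step."""
    q = set(question_entities)
    return [i for i, ents in enumerate(turn_entities)
            if len(q & set(ents)) >= threshold]

# Hypothetical conversation: entities extracted from three history turns.
turns = [["paris", "louvre"], ["weather"], ["louvre", "mona lisa"]]
kept = prune_history(turns, question_entities=["mona lisa", "louvre"])
print(kept)  # [0, 2]: the weather turn shares no entities and is dropped
```

The kept turns would then feed the attention-based re-ranking and the binary term-highlighting stages the abstract describes.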
SWAP: Exploiting Second-Ranked Logits for Adversarial Attacks on Time Series
Time series classification (TSC) has emerged as a critical task in various
domains, and deep neural models have shown superior performance in TSC tasks.
However, these models are vulnerable to adversarial attacks, where subtle
perturbations can significantly impact the prediction results. Existing
adversarial methods often suffer from over-parameterization or random logit
perturbation, hindering their effectiveness. Additionally, increasing the
attack success rate (ASR) typically involves generating more noise, making the
attack more easily detectable. To address these limitations, we propose SWAP, a
novel attack method for TSC models. SWAP focuses on enhancing the confidence
of the second-ranked logits while minimizing the manipulation of other logits.
This is achieved by minimizing the Kullback-Leibler divergence between the
target logit distribution and the predictive logit distribution. Experimental
results demonstrate that SWAP achieves state-of-the-art performance, with an
ASR exceeding 50% and an 18% increase compared to existing methods.
Comment: 10 pages, 8 figures
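The objective sketched below illustrates the idea in this abstract: build a target distribution that promotes the second-ranked logit, then measure the KL divergence to the model's predictive distribution. Constructing the target by swapping the top two logits is a simplified reading for illustration, not necessarily the paper's exact formulation, and the example logits are invented.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def swap_target(logits):
    """Target logits with the top-2 entries swapped, so the second-ranked
    class becomes the most confident while other logits are untouched."""
    t = logits.copy()
    order = np.argsort(logits)
    i, j = order[-1], order[-2]          # top-1 and top-2 indices
    t[i], t[j] = t[j], t[i]
    return t

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))

logits = np.array([2.0, 0.5, 1.8])       # hypothetical model output for one series
target = softmax(swap_target(logits))    # second-ranked class (index 2) now on top
pred = softmax(logits)
loss = kl_divergence(target, pred)       # quantity the attack drives toward zero
print(target.argmax())  # 2
```

Minimizing this divergence with respect to the input perturbation pushes the prediction toward the swapped target while leaving the remaining logits, and hence the required noise, small.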